Abstract

Background: The binding events of DNA-interacting proteins and their patterns can be extensively characterized
by high-density ChIP-chip tiling array data. The characteristics of the binding events can differ between
transcription factors, and may even vary for a given transcription factor among different interaction loci.
Knowledge of binding sites and binding occupancy patterns is very useful for understanding DNA-protein
interaction and its role in the transcriptional regulation of genes.

Results: In view of the complexity of DNA-protein interaction and the opportunity offered by high-density
tiled ChIP-chip data, we present a statistical procedure that focuses on identifying interaction signal regions
instead of signal peaks, using a moving window binomial testing method, and on deconvolving the patterns of
interaction using peakedness and skewness scores. We analyzed ChIP-chip data of four different DNA-interacting
proteins, including transcription factors and RNA polymerase, in fission yeast using our procedure. Our analysis
revealed variation of binding patterns within and across different DNA-interacting proteins. We present the
utility of these patterns in understanding transcriptional regulation from ChIP-chip data.

Conclusions: Our method can successfully detect the signal regions and characterize the binding patterns in
ChIP-chip data, which helps appropriate analysis of ChIP-chip data.



Background 

With the rapid advance of microarray technology, tiling arrays have quickly become one of the most
powerful tools in genome-wide investigations. High-density tiling arrays [1] can be used to address many
biological problems such as transcriptome mapping, protein-DNA interaction mapping (ChIP-chip) and
array CGH, among others [2]. ChIP-chip [3], the focus of this paper, is a technique that combines
chromatin immunoprecipitation (ChIP) with microarray technology (chip). It allows efficient, scalable
and comprehensive identification of binding sites and profiles of DNA-binding proteins [4].
High-density ChIP-chip tiling arrays not only help us map the binding sites of a protein in the genome,
but also allow us to better understand the binding events of the protein by clearly displaying the
binding occupancy profiles. Several methods have been proposed to analyze ChIP-chip data. For example,
Joint Binding Deconvolution (JBD) [5] uses a probabilistic graphical model to improve the spatial
resolution of transcription factor binding site identification. However, it requires the DNA fragment
length distribution, which may not always be available, and its usefulness may be limited for
high-density tiling arrays, whose resolution is already considerably high. MPeak [6,7] fits a mixture of
triangular basis functions to model the binding or interaction data. It ignores the complexity of
binding events and only roughly characterizes the basic patterns for single, direct binding events. The
more complex binding patterns seen in high-density ChIP-chip data may not be well explained by a
mixture of triangular basis functions. Model-based Analysis of Tiling-arrays (MAT) [8] reliably detects
signal-enriched regions on Affymetrix tiling arrays by considering the probe sequence and copy number
of all probes on a single tiling array. The aim of these methods is only to locate the binding sites,
not to characterize the different binding profiles present in different genomic regions.

In view of the complexity of DNA-protein interaction, the opportunity offered by high-density tiled
ChIP-chip data, and the lack of methods that exploit the wealth of information in such data, we present
a new statistical procedure for analyzing high-density ChIP-chip tiling array data that characterizes
protein-DNA interaction in terms of both binding sites and binding profiles. First, we identify the
enriched ChIP signal regions, or protein binding occupancies, using moving window binomial analysis,
and split signal regions with multiple peaks into individual peak regions. Second, the signal regions
are classified into two categories (peaked and flat) using a peakedness test and analyzed separately.
Third, the peak regions are analyzed to obtain the peak positions, which signify the most probable
binding sites, and a skewness assessment is used to improve the assignment of peaks to genes. The flat
binding occupancies are processed to summarize their overall strength; their peak location is
irrelevant because flat occupancies signify binding that is not specific to any particular locus
within their range.

In this article, we apply our procedure to high-density ChIP-chip data from fission yeast
(Schizosaccharomyces pombe) obtained from custom-designed NimbleGen genome tiling arrays of ~380K
probes. We studied two stress-related transcription factors (TFs), Pcr1 and Atf1, under H2O2
treatment [9], and one general transcription factor, Tbp1 (TATA box binding protein). We also included
the RNA polymerase II large subunit, Rpb1 (with and without H2O2 treatment), which is used to indicate
transcriptionally active genes of S. pombe (Table 1). We found that DNA-binding proteins show distinct
patterns in the proportions of peak and flat binding profiles. Tbp1 and the stress-related TFs show
more peak binding patterns, indicating location-specific binding, whereas Rpb1 presents a large
fraction of flat signal regions within protein-coding regions, indicating non-specific binding along
the gene body.

Methods 

The tiling array has very high resolution, with probes covering the whole genome, whereas the ChIP
procedure selects only protein binding sites, which make up a small part of the genome. Therefore,
only a very small proportion of probes on the tiling array carry binding signal, and the signals of
the majority of probes will be close to background. Hence we median-centered the log-transformed data
for further analysis. The proposed statistical procedure is illustrated in Figure 1.
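
As a minimal sketch of the preprocessing described above, the following Python snippet median-centers log-transformed probe signals and computes the MAD that the base threshold is built on. The function names and the choice of log base 2 are assumptions made for illustration; the text states only that the data were log transformed and median centered.

```python
import math
from statistics import median


def median_center_log(signals):
    """Log-transform positive probe signals and subtract the array median.

    Base 2 is an assumption; the text does not name the log base.
    """
    logs = [math.log2(s) for s in signals]
    centre = median(logs)
    return [v - centre for v in logs]


def mad(values):
    """Median absolute deviation of the values around their median."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


if __name__ == "__main__":
    x = median_center_log([1, 2, 4, 8, 16])
    print(x, mad(x))
```

After centering, most probes sit near zero, so a multiple of the MAD gives a robust scale for deciding which probes rise above background.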

Moving window binomial analysis 

For the probe at location $i$, let $x_i$ ($i = 1, \ldots, n$) denote the median-centered log signal.
We define the base threshold cMAD as $c$ times the MAD (median absolute deviation) of all $x_i$ in the
array. Then

$$P = \frac{\#\{x_i \mid x_i > \mathrm{cMAD}\}}{n}$$

is the probability that the signal of a probe passes the base threshold, where
$\#\{x_i \mid x_i > \mathrm{cMAD}\}$ is the total number of probes whose signals pass the base threshold.

Then $p_w(x_i)$ is defined as the probability that $x_i$ is classified as signal when its neighborhood
region $\{x_j\}_{j=i-w}^{i+w}$ is considered as the signal region, and can be computed by binomial
testing as

$$p_w(x_i) = \sum_{k=L}^{2w+1} \binom{2w+1}{k} P^{k} (1-P)^{2w+1-k},$$

where $L$ is the number of probes above the base threshold in the region and $w$ is the predefined
half window size.

Table 1 The list of TFs studied

| Protein | Description | Condition | Repeats |
|---|---|---|---|
| Atf1 | Transcription factor Atf1. Transcription factor required for sexual development and entry into stationary phase. | H2O2 | 1 |
| Pcr1 | Transcription factor Pcr1. Involved in regulation of gene expression for sexual development. | H2O2 | 1 |
| Tbp1 | TATA-binding protein (TBP). General transcription factor that functions at the core of the DNA-binding multiprotein factor TFIID. | Normal | 2 |
| Rpb1 | RNA polymerase II large subunit Rpb1. | Normal | 2 |
| Rpb1 | RNA polymerase II large subunit Rpb1. | H2O2 | 2 |

Figure 1 The proposed statistical procedure for ChIP-chip data analysis.

We define a region $\{x_i\}_{i=s}^{e}$ as a signal region if

$$\begin{aligned}
& p_w(x_i) < \alpha \quad (s \le i \le e), \\
& p_w(x_{s-1}) > \alpha, \quad p_w(x_{e+1}) > \alpha, \\
& e - s > 4, \\
& x_s > \mathrm{cMAD}, \quad x_e > \mathrm{cMAD},
\end{aligned}$$

where $\alpha$ is the p-value cutoff for the binomial test.
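
The moving window binomial analysis can be sketched in Python as follows. This is an illustrative reading of the definitions above, not the authors' implementation: the function names, the default half window size `w=5` and the truncation of windows at the array ends are assumptions, and the region rule applies the four conditions literally to each maximal run of probes with p-value below the cutoff.

```python
from math import comb
from statistics import median


def binomial_tail(L, n, P):
    """Upper-tail probability P(X >= L) for X ~ Binomial(n, P)."""
    return sum(comb(n, k) * P**k * (1 - P)**(n - k) for k in range(L, n + 1))


def window_pvalues(x, c=2.0, w=5):
    """Return (p-values, base threshold) for median-centered log signals x."""
    centre = median(x)
    mad = median(abs(v - centre) for v in x)
    threshold = c * mad                       # base threshold cMAD
    above = [v > threshold for v in x]
    P = sum(above) / len(x)                   # fraction of probes above cMAD
    pvals = []
    for i in range(len(x)):
        # Window of up to 2w+1 probes; truncated at the array ends (assumption).
        lo, hi = max(0, i - w), min(len(x), i + w + 1)
        L = sum(above[lo:hi])
        pvals.append(binomial_tail(L, hi - lo, P))
    return pvals, threshold


def signal_regions(x, alpha=0.001, c=2.0, w=5):
    """Maximal runs with p < alpha that also satisfy the boundary conditions."""
    pvals, threshold = window_pvalues(x, c, w)
    regions, i, n = [], 0, len(x)
    while i < n:
        if pvals[i] < alpha:
            s = i
            while i + 1 < n and pvals[i + 1] < alpha:
                i += 1
            e = i
            if e - s > 4 and x[s] > threshold and x[e] > threshold:
                regions.append((s, e))        # inclusive probe indices
        i += 1
    return regions
```

Because the test counts probes above the threshold rather than averaging their intensities, a single extreme probe cannot create a signal region on its own; several neighboring probes must be elevated together.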

After the signal regions have been identified, the next task is to process them to identify binding
sites and their profiles. As a prerequisite, the signal regions and their neighborhoods are smoothed
to reduce noise. Two different smoothing methods are employed: multiple-round moving average and
moving median smoothing. Multiple-round moving average smoothing retains the signal shape and the
loci of the peak signals (local maxima); this method has already been used in the ChIP-chip peak
finder [10]. We use the moving average method to split multi-peak signal regions, identify the peak
loci and compute the kurtosis and skewness scores of signal regions. The drawback of the moving
average is that it may blur the boundaries of the signal regions; we therefore apply moving median
smoothing to characterize signal regions.
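
The two smoothers can be sketched as below. The window size, the number of rounds and the handling of array ends (shrinking the window) are hypothetical choices, since the text does not specify them. The helper `peak_position` simply returns the index of the maximum of a smoothed profile.

```python
from statistics import median


def moving_average(x, half_window=1, rounds=3):
    """Apply a centered moving average `rounds` times (window shrinks at the ends)."""
    y = list(x)
    for _ in range(rounds):
        smoothed = []
        for i in range(len(y)):
            window = y[max(0, i - half_window):i + half_window + 1]
            smoothed.append(sum(window) / len(window))
        y = smoothed
    return y


def moving_median(x, half_window=1):
    """Centered moving median; preserves step-like region boundaries."""
    return [median(x[max(0, i - half_window):i + half_window + 1])
            for i in range(len(x))]


def peak_position(x):
    """Index of the probe with the maximal value."""
    return max(range(len(x)), key=x.__getitem__)


if __name__ == "__main__":
    step = [0, 0, 0, 5, 5, 5]
    print(moving_average(step, rounds=1))   # edge is smeared across probes
    print(moving_median(step))              # edge stays where it was
```

The step example shows why both smoothers are used: the average keeps the peak location but smears a sharp boundary, while the median leaves the boundary in place.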

Splitting multi-peak signal regions 

Each signal region from the moving window binomial analysis may contain multiple peaks (local
maxima), indicating multiple binding sites, which makes the binding pattern more complex. Therefore,
regions with multiple peaks are split at the troughs of their profiles. To do this, we first assign
each probe $x_i$ of the smoothed profile a label $y_{x_i} \in \{+, -, 0\}$ indicating whether the
binding signal significantly increased, significantly decreased or did not change significantly
relative to the immediately neighboring probe,

$$y_{x_i} = \begin{cases}
+ & \text{if } x_{i+1} - x_i \ge d, \\
- & \text{if } x_{i+1} - x_i \le -d, \\
0 & \text{otherwise,}
\end{cases}$$

where $d = \mathrm{MAD}(\delta)$ and $\delta_i = |x_i - x_{i+1}|$ for $i = 1, 2, \ldots, n-1$. After
removing all probes labeled "0", the region is split at each transition of sign from "-" to "+".
After splitting, signal regions of fewer than 4 probes are removed, since such short signal regions
generally have weak signals and cannot be characterized by the peakedness and skewness assessment.
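
A minimal sketch of the splitting step is given below, assuming the smoothed profile of one signal region is passed in as a list. Two details are assumptions rather than statements from the text: a step is treated as significant when its absolute size is at least d, and the cut is placed at the lowest probe between the last "-" step and the following "+" step. Pieces are returned as half-open index ranges.

```python
from statistics import median


def split_at_troughs(x, min_probes=4):
    """Split a smoothed region at '-' to '+' transitions; drop pieces < min_probes."""
    if len(x) < 2:
        return []
    diffs = [b - a for a, b in zip(x, x[1:])]        # step i goes from x[i] to x[i+1]
    deltas = [abs(v) for v in diffs]
    centre = median(deltas)
    d = median(abs(v - centre) for v in deltas)       # d = MAD(delta)
    # Keep only significant steps, i.e. drop the "0" labels.
    steps = [(i, '+' if v > 0 else '-') for i, v in enumerate(diffs)
             if abs(v) >= d and v != 0]
    cuts = []
    for (i, a), (j, b) in zip(steps, steps[1:]):
        if a == '-' and b == '+':
            # Trough: lowest probe between the end of the fall and the start of the rise.
            cuts.append(min(range(i + 1, j + 1), key=x.__getitem__))
    bounds = [0] + cuts + [len(x)]
    pieces = zip(bounds, bounds[1:])
    return [(s, e) for s, e in pieces if e - s >= min_probes]


if __name__ == "__main__":
    two_peaks = [0, 1, 30, 70, 32, 5, 4, 6, 40, 80, 41, 3, 2]
    print(split_at_troughs(two_peaks))
```

Using the MAD of the step sizes as the significance scale means small wiggles on a plateau or in a trough are ignored, so only a genuine fall followed by a genuine rise triggers a split.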

Peak position identification

The peak position of a signal region with one significant peak is determined after moving average
smoothing of the profile. The position of the probe with the maximal value in the region is defined
as the position of the peak, i.e. the binding site.

Li et al. BMC Proceedings 2011, 5(Suppl 2):S8
http://www.biomedcentral.com/1753-6561/5/S2/S8

Peakedness and skewness assessment of signal regions 

If we consider the smoothed signal region $\{x_i\}_{i=s}^{e}$ as a probability mass function, it can
be assessed for peakedness using its kurtosis score ($K$), given by

$$K = \frac{\sum_{j=s}^{e} (j-\mu)^4 p_j}{\left(\sum_{j=s}^{e} (j-\mu)^2 p_j\right)^2},$$

where $\mu = \sum_{j=s}^{e} j\,p_j$ and $p_j = x_j / \sum_{i=s}^{e} x_i$ ($j = s, s+1, \ldots, e$).
A region is designated as flat-shaped when $K < 2$. Thus, peaked regions are separated from flat regions.
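
The kurtosis score can be computed as in the sketch below, which treats the smoothed signal values as weights on equally spaced probe indices. The function names are illustrative, and clipping negative values to zero before normalizing is an assumption added so that the weights form a valid probability mass function.

```python
def kurtosis_score(x):
    """Kurtosis of probe index j under the mass function p_j = x_j / sum(x)."""
    weights = [max(v, 0.0) for v in x]        # assumption: negative signals are clipped
    total = sum(weights)
    p = [v / total for v in weights]
    mu = sum(j * pj for j, pj in enumerate(p))
    m2 = sum((j - mu) ** 2 * pj for j, pj in enumerate(p))
    m4 = sum((j - mu) ** 4 * pj for j, pj in enumerate(p))
    return m4 / m2 ** 2


def is_flat(x, cutoff=2.0):
    """Flat-shaped region according to the K < 2 rule."""
    return kurtosis_score(x) < cutoff


if __name__ == "__main__":
    print(kurtosis_score([1] * 9))                       # flat plateau
    print(kurtosis_score([0, 0, 0, 1, 10, 1, 0, 0, 0]))  # sharp peak
```

A perfectly flat plateau of nine probes scores 1.77, just under the cutoff of 2, which is consistent with using K < 2 to flag flat occupancies; concentrating the signal on a few central probes pushes the score well above 2.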

Similarly, for single-peak regions, we used the skewness score to test whether the peaks are
left-skewed or right-skewed, which indicates the binding orientation. This is very useful for
assigning a transcription factor binding event to a single gene when the binding region lies in a
bidirectional intergenic region (an intergenic region between a divergent pair of genes). The
skewness score ($G$) is calculated using the formula

$$G = \frac{\sum_{j=s}^{e} (j-\mu)^3 p_j}{\left(\sum_{j=s}^{e} (j-\mu)^2 p_j\right)^{3/2}},$$

where $\mu$ and $p_j$ have the same definitions as in the kurtosis score.
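
A skewness score can be sketched in the same way as the kurtosis score. The snippet assumes the standard moment-based definition (third central moment divided by the 1.5th power of the second central moment) with the same mass function over probe indices; as before, clipping negative values is an assumption.

```python
def skewness_score(x):
    """Moment-based skewness of probe index j under p_j = x_j / sum(x)."""
    weights = [max(v, 0.0) for v in x]        # assumption: negative signals are clipped
    total = sum(weights)
    p = [v / total for v in weights]
    mu = sum(j * pj for j, pj in enumerate(p))
    m2 = sum((j - mu) ** 2 * pj for j, pj in enumerate(p))
    m3 = sum((j - mu) ** 3 * pj for j, pj in enumerate(p))
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    right_tail = [10, 5, 3, 1, 0]             # signal trails off towards higher indices
    print(skewness_score(right_tail))
    print(skewness_score(right_tail[::-1]))
```

With probes indexed in forward-strand coordinates, a positive score means the profile has a longer tail towards higher coordinates and a negative score means the tail points the other way; mirroring the profile flips the sign.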

Results 

We used data from a customized NimbleGen tiling array designed for S. pombe (fission yeast), which
has ~380K 50-mer probes. The probes cover both strands of the entire S. pombe genome, based on the
genome sequence from the Wellcome Trust Sanger Institute (ftp://ftp.sanger.ac.uk/pub/yeast/pombe/).
On each strand, there is a 16 bp interval between two consecutive probes. The probes on the reverse
strand are placed so that they cover the gaps between consecutive probes of the pairing forward
strand. Therefore, neighboring probes overlap each other by 17 bp. Probes with multiple hits in the
genome show the same signal level at multiple loci, and it is not possible to tell which locus
contributed to the signal. Therefore, probes with more than 4 hits were removed from the analysis;
only ~2% of probes were removed.
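
The 17 bp overlap follows from simple arithmetic if each reverse-strand probe is assumed to be centered on the 16 bp gap between two consecutive forward-strand probes (the centering is an assumption; the text states only that the reverse-strand probes cover the gaps):

$$\text{overlap} = \frac{\text{probe length} - \text{gap}}{2} = \frac{50 - 16}{2} = 17\ \text{bp}.$$

Equivalently, 17 + 16 + 17 = 50, so a reverse-strand probe spans the gap plus 17 bp of each flanking forward-strand probe, and forward-strand probe starts repeat every 50 + 16 = 66 bp.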

Transcription factor binding regions and binding patterns 

We analyzed ChIP-chip experiments of three DNA-binding proteins, namely two stress-related
transcription factors, Atf1 and Pcr1, and one general transcription factor, Tbp1, together with the
RNA polymerase II large subunit Rpb1, whose occupancy indicates transcriptionally active genes. Tbp1
is a core subunit of the eukaryotic transcription factor TFIID, which binds specifically to the TATA
box. It contributes to the loading and release of RNA polymerase II at transcription start sites
(TSS). Furthermore, Tbp1 is also a necessary component of the RNA polymerase I and RNA polymerase III
transcription machinery. Therefore, Tbp1 is a good choice for a binding pattern study. There are two
replicates of the ChIP-chip experiment for Tbp1 and Rpb1, and one replicate for Atf1 and Pcr1 with
H2O2 treatment [9]. The data for Tbp1 and Rpb1 will be available in upcoming publications. The signal
regions for each array were identified with the stringent criteria of cMAD = 2MAD and a p-value less
than 0.001 (α = 0.001). The number of signal regions is ~1600 (~1300 before multi-peak splitting) for
each replicate of Tbp1, ~1000 (~800 before multi-peak splitting) for Atf1/Pcr1, and ~800 (~500 before
multi-peak splitting) for Rpb1. We applied the Dice coefficient to measure the similarity of signal
regions between two repeats.

$$s = \frac{2\,|A \cap B|}{|A| + |B|},$$

where $|A|$ and $|B|$ are the total lengths of all signal regions of the first and second repeats,
and $|A \cap B|$ is the length of their overlapping regions. The coefficient for Tbp1 is 0.921, which
indicates that our Tbp1 signal regions are highly reproducible. The coefficient for Rpb1 is 0.743,
which is still considerably high.
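
The Dice coefficient on region lengths can be computed as in the following sketch, where each replicate is a list of half-open `(start, end)` genomic intervals. The interval representation and function names are assumptions for illustration.

```python
def merge(intervals):
    """Sort half-open (start, end) intervals and merge any that touch or overlap."""
    merged = []
    for s, e in sorted(intervals):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def total_length(intervals):
    return sum(e - s for s, e in merge(intervals))


def overlap_length(a, b):
    """Total length shared by two interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def dice(a, b):
    """Dice coefficient 2|A n B| / (|A| + |B|) on region lengths."""
    return 2 * overlap_length(a, b) / (total_length(a) + total_length(b))


if __name__ == "__main__":
    rep1 = [(0, 100), (200, 300)]
    rep2 = [(50, 150), (200, 300)]
    print(dice(rep1, rep2))
```

Measuring overlap in base pairs rather than in counts of regions means a long shared region contributes more than a short one, which suits replicates whose regions differ mainly at the edges.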

The kurtosis scores are summarized in Figure 2. More than half of the signal regions of the
transcription factors (Tbp1, Atf1 and Pcr1) are sharp peaks, whereas the Rpb1 signal regions (with
and without H2O2 treatment) are mostly flat. This is consistent with what is known about the
characteristics of these DNA-binding proteins. Since a ChIP-chip experiment converts transcription
factor binding sites into IP-enriched DNA, the specificity of protein-DNA binding ultimately
determines the peaks of IP enrichment. Therefore, owing to its specific binding at TSS positions,
Tbp1 binding events result in many sharply peaked signal regions. Similarly, Atf1/Pcr1 also display
many sharply peaked signal regions, which is consistent with the belief that they are active at the
promoter regions of their target genes and have specific binding sites. Rpb1, however, mostly
presents flat occupancies in coding regions, reflecting its function in transcription elongation and
the synthesis of messenger RNAs.

Figure 2 The signal regions summary for ChIP-chip datasets.



To make the analysis more reliable, we used the Tbp1 bindings common to the replicated experiments.
A common binding was defined as a binding region from the first experiment that overlaps a region
from the second experiment, with the binding peaks of both regions located inside their
intersection. To measure the level of transcription activity of each gene, we took the median level
of Rpb1 occupancy within its coding region and used the average over the replicated experiments as
a measure of the transcription level.
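
The two definitions in this paragraph can be sketched as follows. The data layout is hypothetical: each binding region is a `(start, end, peak)` tuple with inclusive coordinates, and each Rpb1 replicate is a list of `(position, value)` probe pairs.

```python
from statistics import mean, median


def common_bindings(rep1, rep2):
    """Pairs of overlapping regions whose peaks both lie inside the intersection."""
    common = []
    for s1, e1, p1 in rep1:
        for s2, e2, p2 in rep2:
            lo, hi = max(s1, s2), min(e1, e2)      # intersection of the two regions
            if lo <= hi and lo <= p1 <= hi and lo <= p2 <= hi:
                common.append(((s1, e1, p1), (s2, e2, p2)))
    return common


def transcription_level(replicates, start, end):
    """Average over replicates of the median Rpb1 value within [start, end]."""
    per_replicate = [
        median(value for pos, value in probes if start <= pos <= end)
        for probes in replicates
    ]
    return mean(per_replicate)


if __name__ == "__main__":
    rep1 = [(100, 200, 150), (500, 600, 510)]
    rep2 = [(120, 220, 160), (580, 700, 650)]
    print(common_bindings(rep1, rep2))
```

Requiring both peaks to fall inside the intersection is stricter than requiring overlap alone: in the example, the second pair of regions overlaps but is rejected because neither peak is in the shared stretch.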

We investigated the kurtosis score distribution for all three transcription factors (Atf1, Pcr1 and
Tbp1) in three different genomic regions based on the S. pombe genome TSS definition [11]: regions
within 500 bp of the TSS ([-500, 500]), upstream intergenic regions beyond 500 bp from the TSS
(<-500) and downstream intragenic regions beyond 500 bp from the TSS (>500). As shown in Figure 3,
higher kurtosis scores are observed in the [-500, 500] and <-500 regions, which are mostly promoter
regions, implying that transcription factors favor binding in these two regions. On the other hand,
the TF signal regions falling more than 500 bp downstream of the TSS (mostly coding regions) have
low kurtosis compared with the promoter regions (p-value = 1.620e-10 for Tbp1 and p-value < 2.2e-16
for Atf1/Pcr1), and the binding signals in these regions are generally weak and unstable. The
low-kurtosis signals in coding regions are probably due to interaction between the TFs and RNA
polymerase, with the whole complex being involved in transcription elongation. The Rpb1 binding
regions mostly fall into coding regions and present flat patterns with low kurtosis.

TF binding affinity positively correlates with the gene 
transcription level 

After all peaked signal regions have been identified from the ChIP-chip data, the next step is to
map those regions to genes. To our knowledge, there is no perfect method for accurately mapping
binding sites to genes. Therefore, to reduce the risk of assignment errors, we limited our
investigation to peaks from unidirectional intergenic (IGU) regions, which are easy to assign, i.e.
they are assigned to the downstream gene. Furthermore, we filtered out peaks not within 1 kb
upstream of any gene, as they may lie outside the promoter regions for S. pombe. Peaks upstream of
RNA genes were also removed, since Tbp1 is also associated with RNA polymerase I and RNA polymerase
III. We chose the highest peak if a promoter region had multiple peaks. After filtering, there are
379, 165 and 172 peaks of Tbp1, Atf1 and Pcr1, respectively, in IGU regions (IGU peaks). We then
investigated the relationship between TF binding affinity and transcription levels. From Figure
4(A), we observed a positive correlation between Tbp1 binding affinity and the transcription levels
of the protein-coding genes (correlation 0.55, p-value < 2.2e-16). This implies that highly
transcribed genes tend to be initiated with high Tbp1 binding affinity at their promoters.
Similarly, we observed a positive correlation between Atf1/Pcr1 binding affinity and the change in
Rpb1 level after H2O2 treatment (Figure 4(B); correlation 0.30, p-value = 1.110e-15).

Figure 3 The kurtosis score distribution for TF (Atf1, Pcr1 and Tbp1) occupancies in the three
genomic regions: <-500 (the upstream intergenic regions beyond 500 bp from TSS), [-500, 500]
(regions within 500 bp of TSS) and >500 (the downstream intragenic regions beyond 500 bp from TSS),
and the kurtosis score distribution for Rpb1 binding (with and without H2O2 treatment, pooled).

Skewness of Tbp1 binding regions helps identify Tbp1-regulated genes

The skewness scores of Tbp1 binding regions also correlate positively with Rpb1 occupancy levels. We
investigated the skewness of the 379 Tbp1 peaks in unidirectional intergenic (IGU) regions.
Interestingly, the Tbp1 binding patterns are skewed towards the immediate downstream gene when that
gene is transcribed. In other words, the Tbp1 signal region displays an extended tail into the ORF
of its target gene. As shown in Figure 5(A), transcribed genes on the forward strand tend to display
positive skewness for Tbp1 binding occupancies in their promoter regions, and transcribed genes on
the reverse strand preferentially show negative skewness. Another interesting observation is that
the absolute skewness declines with decreasing Rpb1 level of the downstream genes, indicating that
the Tbp1 binding pattern is enough to predict whether the downstream genes are transcribed in that
condition. Our explanation is that Tbp1 may interact persistently with RNA polymerase in
transcribing regions during the transition between transcription initiation and elongation, so that
the Tbp1 occupancy would correlate with the transcription rate of the downstream genes.

To further test our observations and demonstrate the utility of the skewness of binding regions, we
checked the correlation between the skewness of Tbp1 binding in bidirectional promoters and the Rpb1
levels of the flanking genes. The assignment of binding sites in bidirectional promoters is always a
problem: some studies assign them to both genes, while others assign them to the nearest gene. Here,
we found that the skewness score may help to obtain a better assignment, i.e. to identify the genes
activated by Tbp1. To be conservative, we removed bidirectional intergenic regions in which one of
the flanking genes is an RNA gene and also discarded gene pairs having more than one Tbp1 binding
peak. In the end, 387 gene pairs were used in the analysis. As shown in Figure 5(B), when the
skewness score is significantly positive (greater than 0.15), the transcription level of the gene on
the forward strand is clearly higher than that of the paired gene on the reverse strand, and vice
versa.

Figure 4 (A) Tbp1 binding affinity for the VH (very high, Rpb1>2), H (high, 1.5<Rpb1<2), M (median,
1<Rpb1<1.5) and L (low, 0.5<Rpb1<1) Rpb1 level groups. (B) Atf1 and Pcr1 binding affinity for the VH
(very high, ΔRpb1>2), H (high, 1<ΔRpb1<2), L (low, 0<ΔRpb1<1) and N (no, ΔRpb1<0) groups of Rpb1
level change after H2O2 treatment, where ΔRpb1 = Rpb1_H2O2 - Rpb1 (Rpb1_H2O2 and Rpb1 are the Rpb1
levels with and without H2O2 treatment).

Figure 5 (A) The skewness score of Tbp1 binding in IGU (unidirectional intergenic) regions for the
VH (very high, Rpb1>2), H (high, 1.5<Rpb1<2), M (median, 1<Rpb1<1.5) and L (low, 0<Rpb1<1) Rpb1
level groups. Red boxes are the IGU+ regions and green boxes are IGU- regions. (B) The Rpb1 score
for the right-skewed, symmetric and left-skewed groups in IGB (bidirectional intergenic) regions.
Red boxes are genes on the forward strand and green boxes are genes on the reverse strand. (C)
Illustration of the IGU regions between forward strand genes (IGU+), between reverse strand genes
(IGU-) and IGB regions. Red boxes are genes on the forward strand and green boxes are genes on the
reverse strand.



When the binding pattern appears symmetric (skewness between -0.15 and 0.15), there is no
significant difference in transcription level between genes on the forward strand and the reverse
strand. In addition, as shown in Figure 5, the presence of a symmetric pattern is associated with
low Rpb1 levels (less than 0.5), i.e. there are almost no transcription events, since such
low-level transcription is rarely detectable in our Rpb1 data under the 2MAD cutoff. Therefore, the
skewness of peaks could be helpful in assigning Tbp1 binding to annotated features, particularly for
binding sites located in IGB (bidirectional intergenic) regions. Three examples of Tbp1 binding
patterns and Rpb1 occupancies with two repeats in IGU+ (IGU between forward strand genes), IGU-
(IGU between reverse strand genes) and IGB regions are shown in Figure 6.
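
The assignment rule suggested by these observations can be written as a small helper. The ±0.15 cutoff is the one quoted in the text; the function name and the returned labels are hypothetical, and the rule is a sketch of how skewness could guide assignment rather than a validated classifier.

```python
def assign_bidirectional_peak(skewness, cutoff=0.15):
    """Suggest which flanking gene of a bidirectional region a Tbp1 peak belongs to.

    Positive skewness (tail towards higher coordinates) points to the
    forward-strand gene, negative skewness to the reverse-strand gene.
    """
    if skewness > cutoff:
        return "forward"
    if skewness < -cutoff:
        return "reverse"
    return "unassigned"       # symmetric pattern: no preference


if __name__ == "__main__":
    for g in (0.4, -0.3, 0.05):
        print(g, assign_bidirectional_peak(g))
```

Leaving symmetric peaks unassigned reflects the observation above that symmetric patterns are associated with little or no detectable transcription of either flanking gene.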

Atf1 and Pcr1 bindings also display more skewed patterns at the promoters of stress response genes
(genes with a more than 2-fold Rpb1 change after H2O2 treatment). As shown in Figure 7, the average
skewness score of bindings at stress response genes is positive (the sign of the skewness score is
flipped for bindings at reverse-strand genes), whereas the average skewness score of bindings at
other genes is 0 (p-value = 0.007204). Four genes selected from the list of the top 20
Atf1/Pcr1-bound genes [9] are shown as examples in Figure 8.

Conclusions 

We developed a statistical procedure to characterize the binding events of DNA-interacting proteins,
especially transcription factors, from high-density ChIP-chip tiling array data. The signal regions
are detected using moving window binomial analysis and the binding events are characterized by two
shape parameters, kurtosis and skewness.

We applied our method to ChIP-chip data of the TATA box binding protein (Tbp1), Atf1, Pcr1 and Rpb1
in S. pombe. We found that Tbp1 tends to have more sharply peaked occupancies than Rpb1, which
indicates that our method can efficiently distinguish mostly localized DNA-protein bindings from
scattered DNA-protein interactions. It should be noted that Tbp1 also has flat occupancies. This may
be because the interaction of Tbp1 with the RNA polymerase complex is maintained after transcription
initiation, since other studies have reported that no paused Pol II is observed at promoter regions
in yeast [12]. It is also possible that Tbp1 or other components of TFIID contribute to
transcription elongation. Specific wet-lab experiments could be designed to validate this
hypothesis.

Figure 6 Three examples of Tbp1 binding patterns and Rpb1 occupancies with two repeats in IGU+,
IGU- and IGB regions. The blue boxes in the first track indicate the gene ORF regions, and the
vertical lines indicate the peak loci for Tbp1 binding.

Figure 8 Four examples of Atf1 and Pcr1 binding patterns and Rpb1 (with and without H2O2 treatment)
occupancies with two repeats. The blue boxes in the first track indicate the gene ORF regions, and
the vertical light blue bars indicate the stress response gene ORF regions.

The two shape parameters of the signal regions, kurtosis and skewness, can characterize the binding
patterns and the associated biology. We used kurtosis to classify the regions into peak and flat
regions. The peak regions mostly fall into promoters, whereas the flat regions are mostly very large
and cover coding regions. We have demonstrated that the binding patterns of peak regions in
promoters are skewed towards the downstream genes if these are transcribed; hence skewness can help
us to predict whether the downstream gene is transcribed and to assign binding sites to genes in
bidirectional intergenic regions.

Our method is applicable not only to ChIP-chip data but can also be adapted to other datasets with a
similar goal. For example, ChIP-seq and ChIP-chip measure the same signals with different
techniques. The binding patterns in ChIP-seq data should be similar to those in ChIP-chip data, so
the peakedness and skewness assessment can be used for further analysis. Our method can also be
extended to other tiling array data if the patterns of the signal regions are important to the
corresponding studies.